AN APPLICATION OF THE INVARIANCE PRINCIPLE TO THE
STUDENT HYPOTHESIS

By 

Paul  L.  Keyer 

1. Introduction and Summary.

The problem which is posed to the statistician in the formulation of
the general decision problem as outlined in [1] reduces itself essentially
to the choice of a pure or randomized decision procedure, which will also
be called a statistical strategy. The actual choice of a procedure depends
on the criterion employed to decide how the risk is to be minimized.

Various criteria for selecting a decision procedure from a class of
possible procedures have been investigated, with most attention having
been given to the minimax and Bayes principles. Unfortunately, neither
of these approaches is completely satisfactory: the former assumes
without much justification that nature (the statistician's "opponent")
will do its worst, while the latter requires the knowledge of some a priori
distribution, which is often not available. Hence in many cases an "optimal"
solution to the general decision problem does not exist and we must accept
somewhat less ambitious aims. This is analogous to the classical problem
of testing hypotheses, where we often are unable to find uniformly most
powerful tests and hence restrict ourselves to tests satisfying conditions
such as unbiasedness, similarity, and invariance.

One way out of this dilemma is to construct a class of procedures
which is optimal in the sense that no matter what criterion is used, one
need not look outside it for selecting a procedure. Such a class is
called essentially complete. It is often desirable to restrict oneself
further to a special class of statistical strategies from which to select
an essentially complete class. This is done in order to reduce the number
of possible procedures to be considered. Also, this often leads to a
simple characterization of an essentially complete class within this
restricted class.

In  this  paper  we  construct  an  essentially  complete  class  of  invariant 
decision  procedures  for  a type  of  problem  arising  from  the  observation  of 
a normally  distributed  random  variable. 

More specifically, suppose x is normally distributed with mean $\mu$ and
variance $\sigma^2$, both unknown. We consider the problem of making decisions
concerning the quantity

$$p = \mathrm{Prob}(x > 0) = \frac{1}{\sqrt{2\pi}} \int_{0}^{\infty} e^{-\frac{1}{2}(y-\delta)^{2}}\, dy ,$$

where $\delta = \mu/\sigma$. We assume that the loss involved in making these decisions
is a function only of p, and hence only of $\delta$, and does not depend on $\mu$
and $\sigma$ individually. Decisions concerning the quantity p, whether they be
in the form of its estimation or in the form of multi-decisions, will be referred
to as decisions of the Student hypothesis type.

A special case of the above problem occurs when we are testing, for
example, the hypothesis p = 1/2 vs. p > 1/2. For if p = 1/2, $\delta$ and
hence $\mu$ equals zero. Thus the above becomes a hypothesis involving the
mean. This is the classical problem for which Student's t-test has been
shown optimal.

By appealing to the invariance principle, it is shown here that the
t-test is optimal for the more general situation of the Student hypothesis
given above. It is shown that these optimal procedures are monotone in t
and form an essentially complete class of invariant decision procedures.

Analogous results are obtained for the multivariate case, in which Hotelling's
$T^2$ replaces Student's t.

In order to demonstrate this optimality and monotonicity of the t-test
for the general Student hypothesis, we appeal to a theorem proved by
H. Rubin [7]. This theorem characterizes an essentially complete class of
procedures as the class of monotone procedures. To prove this theorem,
certain assumptions concerning the loss function and the action space are
made. Also it is supposed that a real-valued random variable, depending on
a real-valued parameter, is observed; furthermore we assume that the distribution
of this random variable has a monotone likelihood ratio.

We are able to reduce the general Student hypothesis type problem to the
observation of a non-central t or non-central F variable, by restricting
ourselves to invariant procedures. We show in this paper that both of these
distributions possess monotone likelihood ratios, and hence the above theorem
is applicable.

The mathematical formulation of the problem will be given on a much
more general level than would be required to discuss the specific problems
treated in this paper, and will follow quite closely a paper by Blackwell,
Girshick, and Rubin [2] on the invariance principle. The reason for
introducing the machinery given in [2] is to put the classical problems
with the Student hypothesis into this new and more embracing framework.

In many statistical problems, we are concerned with a normally distributed
random variable, with mean $\mu$ and variance $\sigma^2$, both unknown, where the
consequences of the action taken based on a random sample depend only on
the proportion of the area under this normal curve exceeding (or falling
short of) a given number A. For example, in industrial applications, the
quality of a lot of goods may be measured by the fraction of the material
that exceeds a given limit A, with respect to some normally distributed
characteristic. On the basis of a sample of observations we may wish
either to estimate the proportion or to make decisions as to the disposition of
the lot.

More specifically, we consider the problem of making decisions concerning
Prob(x > A).

Taking A = 0, without loss of generality, and letting $y = x/\sigma$, we obtain

$$p = \mathrm{Prob}(x > 0) = \frac{1}{\sqrt{2\pi}} \int_{0}^{\infty} e^{-\frac{1}{2}(y-\delta)^{2}}\, dy ,$$

where $\delta = \mu/\sigma$. We assume that the loss involved in making these decisions
is a function only of p, and hence only of $\delta$, and does not depend on $\mu$
and $\sigma$ individually.
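As a small numerical illustration of why p depends on the parameters only through $\delta = \mu/\sigma$, the integral above equals the standard normal distribution function evaluated at $\delta$. The sketch below evaluates it with the error function; the function name `proportion_above_zero` is hypothetical and not part of the paper.

```python
import math


def proportion_above_zero(mu: float, sigma: float) -> float:
    """Return p = Prob(x > 0) for x ~ N(mu, sigma^2).

    The value depends on (mu, sigma) only through delta = mu / sigma.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    delta = mu / sigma
    # Standard normal c.d.f. at delta, written with the error function.
    return 0.5 * (1.0 + math.erf(delta / math.sqrt(2.0)))


# Two parameter points on the same ray mu / sigma = 0.5 give the same p.
p_a = proportion_above_zero(1.0, 2.0)
p_b = proportion_above_zero(5.0, 10.0)
```

Here `p_a` and `p_b` agree because both parameter points have the same ratio $\mu/\sigma$, and p = 1/2 exactly when $\mu = 0$.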

The classical problem for which Student's t is used, namely constructing
significance tests for the mean of a normally distributed random variable,
is clearly a special case of the above. For if p = 1/2, $\delta$ and hence $\mu$
equals zero, and the testing of $H_0$: p = 1/2 vs. $H_1$: p $\neq$ 1/2 becomes a test
involving the mean. In order to prove optimality of the t-test, even in
this special case, we assume that the consequence of a wrong decision depends
on $\delta$ and not on $\mu$ and $\sigma$ individually.

Various optimal properties for the test based on Student's t have been
demonstrated for this special case. For instance, we know that if we test

(i) $H_0$: $\mu = 0$
    $H_1$: $\mu > 0$,

the u.m.p. test is

    $\varphi(t) = 1$ if t < c
    $\varphi(t) = 0$ if t > c,

where $\varphi(t)$ = Prob. of accepting $H_0$ given t, or

(ii) $H_0$: $\mu = 0$
     $H_1$: $\mu \neq 0$

for which no u.m.p. test exists, but the u.m.p. unbiased test is

    $\varphi(t) = 1$ if |t| < k
    $\varphi(t) = 0$ if |t| $\geq$ k.

When we restrict ourselves to invariant procedures for testing the above
hypothesis, it has been shown that the t-test yields the best invariant
procedure. Recently the admissibility of this best invariant procedure
has been shown by Lehmann and Stein [4]. The power function, which was
used as a criterion for loss, is a function of $\delta$ only and hence is of the
type considered.

In this paper we show that, by appealing to the invariance principle,
optimal properties of the t-test hold in the general problem of the Student
hypothesis, as given above. We may, for instance, partition the $\delta$-axis
into k non-overlapping intervals $I_1, \ldots, I_k$ and test the hypotheses
$\delta \in I_i$, i = 1, ..., k, i.e., consider a multi-decision problem. If we then
assume that for a fixed decision, the loss function depends on $\delta$ and possesses
certain monotonicity properties, we can, by restricting ourselves to invariant
procedures, find a strategy based on t which is uniformly as good as or better
than any other given invariant procedure. Furthermore, a constructive method
for finding such a better procedure is given in [1]. These procedures, based
on t, will turn out to be monotone. When only 2 actions are involved, the
essentially complete class of invariant procedures (i.e., the monotone
procedures based on t) is minimal. We may also wish to estimate the quantity p.
Here again, the invariant estimate which is optimum will be a monotone
function of t.

In considering invariant procedures, many others besides t suggest
themselves. For example, the statistic $H = \bar{x}/R$, where R is the sample range,
is commonly used in quality control applications. It was shown in [8]
that for n $\leq$ 10, H yields an excellent approximation to t in terms of the
power function. Naturally, if we take computational and time costs into
consideration it may well be that for small samples the use of H is better
than that of t. Ignoring these costs, the above discussion shows that
procedures based on t dominate those based on H.

3. Mathematical Formulation and Definitions.

In  this  section  we  shall  discuss  the  mathematical  framework  within 
which  our  results  will  be  stated.  Also  we  shall  define  rigorously  various 
concepts  which  have  been  referred  to  rather  loosely  in  the  preceding  sections . 

Although most of the concepts are discussed in great detail elsewhere,
as, for example, in [1] and [2], some of the basic ideas will be recapitulated
here for the sake of completeness.

We  assume  that  we  are  given  the  following: 

(1) A sample space $\mathcal{Z} = (Z, \mathcal{B}, \Omega, P)$ where Z is the space of outcomes
of an experiment, $\mathcal{B}$ a Borel field of subsets of Z, $\Omega$ an arbitrary
parameter space, and P a function defined on $\mathcal{B} \times \Omega$, so that for
each $\omega \in \Omega$, $P(\cdot \mid \omega)$ is a probability measure on $\mathcal{B}$. We shall write,
for $S \in \mathcal{B}$, $\omega \in \Omega$, $P_\omega(S) = P(S \mid \omega)$.

(2) An action space A and a Borel field $\mathcal{A}$ of subsets of A.

(3) A loss function L defined on $\Omega \times A$ which is $\mathcal{A}$-measurable for
each $\omega \in \Omega$. For any $\omega \in \Omega$, $a \in A$, $L(\omega, a)$ represents the
loss to the statistician if nature is in state $\omega$ and he chooses
action a. We may assume $L(\omega, a) \geq 0$.

(4) A class $\mathcal{H}$ of randomized decision functions $\eta$ so that for each
$z \in Z$, $\eta_z$ is a probability measure on $\mathcal{A}$. In particular,
for $E \in \mathcal{A}$, $z \in Z$, $\eta_z(E)$ is the probability of taking an action in E
on observing z.

In most of our applications we shall only need to consider
the class D of pure decision procedures, where each d maps Z
into A.

(5) A statistical game $G = (\Omega, \mathcal{H}, \rho)$ where $\rho$ is the risk function
defined by

$$\rho(\omega, \eta) = \int_Z \int_A L(\omega, a)\, d\eta(a \mid z)\, dP(z \mid \omega) .$$

In terms of these concepts, we now define:

A class C of decision procedures is essentially complete if for any
procedure $\eta'$, there exists a procedure $\eta \in C$ so that $\rho(\omega, \eta) \leq \rho(\omega, \eta')$
for all $\omega \in \Omega$. If for any $\eta' \notin C$ we can find $\eta \in C$ so that $\rho(\omega, \eta) \leq
\rho(\omega, \eta')$ for all $\omega$, with strict inequality for at least one $\omega$, then we
say C is complete.

Clearly the construction of a complete class C is extremely desirable,
since we need not look outside it for selecting procedures for a particular
statistical game.
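The two definitions can be checked mechanically when there are finitely many states of nature and finitely many procedures. The following is a minimal sketch under that assumption; the procedure labels, the risk table, and the function names are hypothetical illustrations, not objects from the paper.

```python
from typing import Dict, Iterable, Sequence

Risk = Sequence[float]  # risks rho(omega, procedure), one entry per state omega


def weakly_better(r1: Risk, r2: Risk) -> bool:
    """True if r1 <= r2 in every state of nature."""
    return all(a <= b for a, b in zip(r1, r2))


def strictly_better(r1: Risk, r2: Risk) -> bool:
    """True if r1 <= r2 everywhere and r1 < r2 in at least one state."""
    return weakly_better(r1, r2) and any(a < b for a, b in zip(r1, r2))


def is_essentially_complete(C: Iterable[str], risks: Dict[str, Risk]) -> bool:
    """Every procedure is matched or beaten by some member of C."""
    members = list(C)
    return all(
        any(weakly_better(risks[c], risks[p]) for c in members) for p in risks
    )


def is_complete(C: Iterable[str], risks: Dict[str, Risk]) -> bool:
    """Every procedure outside C is strictly beaten by some member of C."""
    members = list(C)
    outside = [p for p in risks if p not in members]
    return all(
        any(strictly_better(risks[c], risks[p]) for c in members) for p in outside
    )


# Hypothetical risk table with two states of nature; d4 duplicates d1.
risks = {"d1": (1.0, 2.0), "d2": (2.0, 1.0), "d3": (2.0, 2.0), "d4": (1.0, 2.0)}
```

With this table the class {d1, d2} is essentially complete but not complete, because d4 lies outside the class and is only matched, not strictly improved upon, by d1.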
Next we introduce the invariance principle as we shall use it in our
applications. This principle has been used to advantage in various
statistical problems, as is mentioned in a paper by E. Lehmann [5]. The formulation
given here, following the previously mentioned paper [2], is different from
the earlier ones in that it considers invariance from the more general point
of view of decision theory.

Let $\mathcal{G}$ be a group. With each $g \in \mathcal{G}$ are associated functions $g_Z$,
$g_\Omega$, and $g_A$, defined on Z, $\Omega$, and A respectively, so that $g \to g_Z$ is a
homomorphism of $\mathcal{G}$ into the transformation group on Z, with similar
interpretations for the correspondence $g \to g_\Omega$ and $g \to g_A$; i.e., with each element
of the group is associated in a 1-1 way an element which maps Z onto itself,
A onto itself, and $\Omega$ onto itself.

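We shall only deal with a particular type of group, namely admissible,
which we now define.

Definition. The group $\mathcal{G}$ and its associated functions $g_Z$, $g_\Omega$, and $g_A$
are admissible with respect to the game $G = (\Omega, D, \rho)$ if:

(i) for each $g \in \mathcal{G}$, $g_Z$ and $g_A$ are measurable with respect to $\mathcal{B}$
and $\mathcal{A}$ respectively;

(ii) for each $g \in \mathcal{G}$, $S \in \mathcal{B}$, $\omega \in \Omega$,

$$P(g_Z(S) \mid g_\Omega(\omega)) = P(S \mid \omega);$$

(iii) for each $g \in \mathcal{G}$, $a \in A$, $\omega \in \Omega$,

$$L(g_\Omega(\omega), g_A(a)) = L(\omega, a).$$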

We  assume  in  what  follows,  that  only  admissible  groups  are  considered. 

The purpose of introducing invariance at all is to reduce
the number of possible procedures which might have to be considered, and to
restrict ourselves only to invariant procedures, which we now define.
Definition. A pure decision function $d \in D$ is invariant under $\mathcal{G}$ if
for all $g \in \mathcal{G}$, $z \in Z$,

$$d(g_Z(z)) = g_A(d(z)).$$

A randomized decision function $\eta$ is said to be invariant under $\mathcal{G}$
if for every $z \in Z$, $E \in \mathcal{A}$, and $g \in \mathcal{G}$,

$$\eta(g_A(E) \mid g_Z(z)) = \eta(E \mid z).$$

These definitions simply say that a decision procedure is invariant
if the same action results from observing z after it has been operated
on by g, i.e., observing $g_Z(z)$.
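To make the definition concrete, the sketch below checks the condition $d(g_Z(z)) = g_A(d(z))$ numerically for the scale group that the paper applies later in Section 6, where $g_Z(\bar{x}, s) = (g\bar{x}, gs)$ with g > 0 and $g_A(a) = a$. The two decision rules, the cutoff value, and all function names are hypothetical examples chosen only for illustration.

```python
from typing import Callable, Iterable, Tuple

Outcome = Tuple[float, float]  # z = (x_bar, s) with s > 0


def g_Z(g: float, z: Outcome) -> Outcome:
    """Scale change acting on the outcome space."""
    x_bar, s = z
    return (g * x_bar, g * s)


def g_A(g: float, a: int) -> int:
    """The group leaves actions unchanged."""
    return a


def t_rule(z: Outcome, cutoff: float = 0.5) -> int:
    """Hypothetical rule depending on z only through x_bar / s."""
    x_bar, s = z
    return 1 if x_bar / s > cutoff else 0


def mean_rule(z: Outcome, cutoff: float = 0.5) -> int:
    """Hypothetical rule that looks at x_bar alone."""
    return 1 if z[0] > cutoff else 0


def is_invariant(rule: Callable[[Outcome], int],
                 points: Iterable[Outcome],
                 scales: Iterable[float]) -> bool:
    """Check d(g_Z(z)) == g_A(d(z)) on a finite set of points and scales."""
    scales = list(scales)
    return all(rule(g_Z(g, z)) == g_A(g, rule(z)) for z in points for g in scales)


points = [(1.0, 1.0), (0.2, 1.0), (-1.0, 2.0)]
scales = [0.1, 1.0, 10.0]
```

On these points the rule based on $\bar{x}/s$ passes the check, while the rule based on $\bar{x}$ alone fails it, since rescaling the data can move $\bar{x}$ across the cutoff. A finite check of this kind can refute invariance but cannot prove it.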

We  shall  next  define  another  concept  that  arises  in  studying  invariant 
procedures,  namely  that  of  an  orbit.  This  is  very  closely  related  to  the 
more  familiar  concept  of  a maximal  invariant,  as  we  shall  point  out  below. 

Definition. Let $z_0 \in Z$ be a fixed element. Then we call
$K = \{z : z = g_Z(z_0) \text{ for some } g \in \mathcal{G}\}$ the orbit generated by $z_0$.

As we vary $z_0$ over Z, we obtain the class $\mathcal{K}$ of orbits on Z; this is
clearly a partition of Z, i.e., if we define $z_1 \sim z_2$ for $z_1$, $z_2$ in the same
orbit, then $\sim$ defines an equivalence relation over Z. To tie this in
with the concept of a maximal invariant, we use the following definition.

Definition. A function f defined on Z is a maximal invariant if

(i) $f(g_Z(z)) = f(z)$ for all $g \in \mathcal{G}$, $z \in Z$;

(ii) $f(z_2) = f(z_1) \Rightarrow$ there exists a $g \in \mathcal{G}$ so that $z_2 = g_Z(z_1)$.

It is clear that a maximal invariant is constant on each orbit and
assumes different constant values on different orbits. There may be many
functions f which are maximal invariants for a group $\mathcal{G}$; each, however,
induces the same partition on Z, namely the class of orbits. Hence we may
identify orbits and maximal invariants, as we shall do later.

Needless to say, the same concepts discussed above for Z apply also
to $\Omega$.

We shall now summarize a few of the main results as set forth in [2].
Before getting involved in a maze of notation, we shall briefly outline what
we are attempting to do.

We start out with a statistical game $G = (\Omega, \mathcal{H}, \rho)$, where both the
spaces Z and $\Omega$ may be of rather complicated form. We introduce an
equivalent game $G^*$ (for the definition of equivalence of games, see [1]),
in which nature chooses an orbit of $\Omega$ and an element $g \in \mathcal{G}$, and the
statistician observes an orbit K. If we restrict ourselves to invariant
procedures (with respect to an admissible group), it turns out that the
risk in the equivalent game $G^*$ does not depend on the choice of $g \in \mathcal{G}$.

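We now give a brief résumé of the mathematics involved in the above
reduction. We assume given an admissible group $\mathcal{G}$ and a statistical game
G.

(i) Consider the class of orbits on $\Omega$, say $\mathcal{T}$; i.e., $\Theta \in \mathcal{T}$ implies

$$\Theta = \{\omega : \omega = g_\Omega(\omega_0) \text{ for some } g \in \mathcal{G}\} .$$

Now fix any $\Theta \in \mathcal{T}$ and let $\psi$ be a selection function defined on $\mathcal{T}$ and
taking values in $\Omega$, i.e., for each $\Theta$, $\psi$ chooses a point $\psi(\Theta) \in \Theta$. Define

$$P^*(S \mid g, \Theta) = P(S \mid g_\Omega(\psi(\Theta)))$$

$$L^*(g, \Theta, a) = L(g_\Omega(\psi(\Theta)), a) ,$$

then

$$G = (\Omega, \mathcal{H}, \rho) \sim G^* = (\Omega^*, \mathcal{H}, \rho^*)$$

where $\Omega^*$ consists of the pairs $(g, \Theta)$, $g \in \mathcal{G}$, $\Theta \in \mathcal{T}$,

and

$$\rho^*(g, \Theta, \eta) = \int_Z \int_A L^*(g, \Theta, a)\, d\eta(a \mid z)\, dP^*(z \mid g, \Theta) .$$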


Z A 


(ii) Assume

$$\{g : g_Z(z_0) = z_0\} = \{e\}$$

where e is the identity element of $\mathcal{G}$. Then there exists a 1-1 function
$\chi : \mathcal{G} \to K$, defined by $\chi(g) = g_Z(z_0)$, for $z_0$ fixed $\in K$. Hence there is
induced a probability distribution $\pi_K$ over $\mathcal{G}$ since a distribution is
assumed to exist over K. Also, if $\eta$ is an invariant procedure, $\rho^*$ is
independent of the choice of g and we may as well use e. Thus we can
write

$$\rho^*(e, \Theta, \eta) = \int_{\mathcal{K}} \int_A \int_{\mathcal{G}} L^*(e, \Theta, g_A(a))\, d\pi_K(g \mid \Theta)\, d\eta(a \mid K)\, dQ(K \mid \Theta)$$

where $d\eta(a \mid K)$ is a probability distribution over A for given K, and dQ is a
probability distribution over the orbits K for given $\Theta$.

(iii) In the special case in which $g_A(a) = a$ the expression for $\rho^*$
in (ii) simplifies to

$$\rho^*(e, \Theta, \eta) = \int_{\mathcal{K}} \int_A L^*(e, \Theta, a)\, d\eta(a \mid K)\, dQ(K \mid \Theta) .$$

This reduction is possible since the integration over $\mathcal{G}$ was eliminated as

$$\int_{\mathcal{G}} d\pi_K(g) = 1 .$$

It is this form which will occur in most of our applications.

(iv) Suppose that f and $\lambda$ are maximal invariants on Z and $\Omega$ respectively.
Then we may write (making appropriate notational changes since
originally the functions involved were defined for different arguments):

$$\rho^*(e, \lambda, \eta) = \int \int L^*(e, \lambda, a)\, d\eta(a \mid f)\, dQ(f \mid \lambda) .$$

4. Characterization of an Essentially Complete Class.

We shall now state the main theorem used in obtaining essentially
complete classes of decision procedures. It is a theorem proved by
H. Rubin [7].

Before stating the main result, we introduce a few more concepts.

Definition. Let Z be a real-valued random variable and let A, the
action space, be a closed subset of the real line. Then a monotone
procedure $d : Z \to A$ is defined by:

$$x, y \in Z,\ x > y,\ d(x) = a_1,\ d(y) = a_2 \ \Longrightarrow\ a_1 \geq a_2 .$$

If A is finite, e.g., $A = (a_1, \ldots, a_k)$, a monotone procedure is characterized
by a set of numbers $x_1 \leq x_2 \leq \ldots \leq x_k$ so that action i is taken if and only
if the outcome is a point in the i-th of the consecutive intervals which these
numbers delimit. (See, for example, [1], Chapter 7.)
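For a finite action space the characterization by cut points translates directly into code. The sketch below is one possible encoding under stated assumptions: it uses k - 1 finite cut points for k ordered actions (the two outer intervals being unbounded), ignores randomization at the cut points themselves, and the name `monotone_procedure` is hypothetical.

```python
import bisect
from typing import Callable, Sequence


def monotone_procedure(cut_points: Sequence[float],
                       actions: Sequence[float]) -> Callable[[float], float]:
    """Build a pure monotone procedure d from ordered cut points.

    actions must be listed in increasing order and contain exactly one
    more element than cut_points. The observation x is mapped to the
    action whose interval contains it; a point equal to a cut point is
    assigned to the interval on its right.
    """
    if len(actions) != len(cut_points) + 1:
        raise ValueError("need exactly len(cut_points) + 1 actions")
    if list(cut_points) != sorted(cut_points):
        raise ValueError("cut points must be non-decreasing")
    if list(actions) != sorted(actions):
        raise ValueError("actions must be non-decreasing")

    def d(x: float) -> float:
        # Number of cut points <= x is the index of the interval holding x.
        return actions[bisect.bisect_right(cut_points, x)]

    return d


# Three actions 0 < 1 < 2 separated by the cut points -1 and 1.
d = monotone_procedure([-1.0, 1.0], [0, 1, 2])
```

By construction, a larger observation can never lead to a smaller action, which is exactly the defining property above.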

Definition. Let Z be a real-valued random variable. Suppose $p_\omega$, the
probability distribution of Z, depends on a real-valued parameter $\omega$.
Then $p_\omega$ is said to have a monotone likelihood ratio if for $z_1 > z_2$,
$\omega_1 > \omega_2$ we have:

$$p(z_1 \mid \omega_1)\, p(z_2 \mid \omega_2) > p(z_1 \mid \omega_2)\, p(z_2 \mid \omega_1) .$$
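The defining inequality can be tested on a finite grid of observations and parameter values. The following sketch does this for two location families that are not taken from the paper and serve only as illustrations: the normal family with unit variance, which has a monotone likelihood ratio, and the Cauchy family, which does not. The function names are hypothetical, and the check uses the non-strict form of the inequality with a small floating-point tolerance.

```python
import math
from itertools import combinations
from typing import Callable, Sequence


def has_monotone_likelihood_ratio(p: Callable[[float, float], float],
                                  zs: Sequence[float],
                                  omegas: Sequence[float]) -> bool:
    """Check p(z1|w1) p(z2|w2) >= p(z1|w2) p(z2|w1) for z1 > z2, w1 > w2 on a grid."""
    zs, omegas = sorted(zs), sorted(omegas)
    for z2, z1 in combinations(zs, 2):          # pairs with z1 >= z2
        for w2, w1 in combinations(omegas, 2):  # pairs with w1 >= w2
            lhs = p(z1, w1) * p(z2, w2)
            rhs = p(z1, w2) * p(z2, w1)
            if lhs < rhs * (1.0 - 1e-12):
                return False
    return True


def normal_density(z: float, omega: float) -> float:
    """Density of N(omega, 1) at z."""
    return math.exp(-0.5 * (z - omega) ** 2) / math.sqrt(2.0 * math.pi)


def cauchy_density(z: float, omega: float) -> float:
    """Density of the Cauchy distribution with location omega at z."""
    return 1.0 / (math.pi * (1.0 + (z - omega) ** 2))


grid = [0.5 * i for i in range(-10, 11)]
params = [-1.0, 0.0, 1.0]
```

A grid check of this kind can only refute the property; passing it is evidence, not a proof.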

As we have mentioned in some of the introductory sections of this
paper, we shall restrict ourselves to a special class of loss functions,
which we now describe.

Let A be a closed subset of the real line, and assume the parameter
space $\Omega$ to be an interval, say (a, b).

Suppose $\inf_{a \in A} L(\omega, a)$ is assumed; the point at which it is assumed
clearly depends on $\omega$ and we denote it by $q(\omega)$. It is obvious that $q(\omega)$
need not be unique, since there may be a whole set of values of a which
yields $\inf L(\omega, a)$. We simply let $q(\omega)$ be a point at which this inf is
assumed.

We further suppose

(i) $q(\omega)$ is increasing in $\omega$,

(ii) $a \leq a' \leq q(\omega)$ or $q(\omega) \leq a' \leq a$ $\Longrightarrow$ $L(\omega, a') \leq L(\omega, a)$ for all $\omega$.

If A is finite, say $A = (a_1, \ldots, a_k)$, this reduces to being able to
label $a_1, \ldots, a_k$ so that $\Omega = (a, b)$ can be subdivided into k consecutive
subintervals $I_1, \ldots, I_k$ (some of which may be empty) with $\bigcup_{i=1}^{k} I_i = \Omega$, so
that

(i) for all $\omega \in I_i$, $L(\omega, i) = \min_j L(\omega, j)$;

(ii) $s \leq j \leq i$ or $i \leq j \leq s$ $\Longrightarrow$ $L(\omega, j) \leq L(\omega, s)$ for all $\omega \in I_i$.


We  can  now  state  the  main  theorem  of  this  section. 

Theorem. Let Z be a real-valued random variable, $\Omega$ a subset of the
real line, and A a closed subset of the real line. Suppose that $p_\omega$, the
distribution of Z, depends on $\omega$ in such a way that it has a monotone
likelihood ratio. Suppose further that the loss $L(\omega, a)$ satisfies the conditions
stipulated above. Then an essentially complete class of decision procedures
can be characterized as the class of monotone procedures.

This theorem is proved in [1] for the class of exponential distributions
and finite A. The extension to the above form is contained in an unpublished
paper by Rubin.

In order to apply this theorem to the construction of essentially
complete classes, we must reduce the problem to the observation of a
real-valued random variable having a distribution with a monotone likelihood
ratio (supposing the other assumptions are fulfilled). It is for this
reduction that we make use of invariance, i.e., restricting ourselves to
invariant procedures, and show that in certain cases we can reduce the
problem to the form for which the above theorem holds.

To do this for the specific case where we deal with observations
obtained from univariate or multivariate normal populations we need two
preliminary results, which will be developed in the next section.


5. Monotonicity of the Non-central t and Non-central F Distributions.

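We first consider the case of the non-central t-distribution. Suppose
z and w are independently distributed random variables, z being $N(\delta, 1)$
and w being $\chi^2_k$. Then the joint distribution of z and w is

$$f(z, w) = \frac{1}{\sqrt{2\pi}\, 2^{k/2}\, \Gamma(k/2)}\; e^{-\frac{1}{2}(z-\delta)^{2}}\; w^{\frac{k-2}{2}}\, e^{-w/2} .$$

Letting $t = z / \sqrt{w/k}$, we obtain for the distribution of t:

$$p(t \mid \delta) = c \int_0^\infty e^{-\frac{1}{2}\left(t\sqrt{w/k} - \delta\right)^{2}}\; w^{\frac{k-1}{2}}\, e^{-w/2}\, dw ,$$

where the constant c does not depend on $\delta$.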
We  now  shall  prove: 

Theorem. The non-central t distribution $p(t \mid \delta)$ has a monotone
likelihood ratio.

Proof. The proof of this theorem has been given in [3] and [4]; the
proof given here is a slight modification of the one found in [4].

Let

$$F(t) = \frac{p(t \mid \delta_1)}{p(t \mid \delta_2)}$$

where $\delta_1 > \delta_2$. We must show that $t_1 > t_2 \Rightarrow F(t_1) > F(t_2)$. Since F is
continuous in t, only 2 cases arise:

(i) $t_2 < t_1 < 0$

(ii) $t_1 > t_2 > 0$


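We shall prove (i); i.e., suppose t < 0. Let $v = -t\sqrt{w/k}$. Then one obtains

$$F(t) = c\, \frac{\int_0^\infty e^{-\delta_1 v}\, f(v)\, e^{-\theta v^2}\, dv}{\int_0^\infty e^{-\delta_2 v}\, f(v)\, e^{-\theta v^2}\, dv} = F^*(\theta) ,$$

where

$$c = e^{-\frac{1}{2}(\delta_1^2 - \delta_2^2)} , \qquad f(v) = v^{k}\, e^{-v^2/2} , \qquad \theta = \frac{k}{2t^2} .$$

Now as t < 0 increases, $t^2$ decreases, and so $\theta$ increases. Hence we must
show that $F^*$ is increasing in $\theta$. Let $0 < \theta_1 < \theta_2$ and define

$$\Delta = \frac{1}{c}\left[F^*(\theta_2) - F^*(\theta_1)\right] .$$

Therefore

$$\Delta = \frac{\int_0^\infty f(v)\, e^{-\delta_1 v - \theta_2 v^2}\, dv}{\int_0^\infty f(v)\, e^{-\delta_2 v - \theta_2 v^2}\, dv} - \frac{\int_0^\infty f(v)\, e^{-\delta_1 v - \theta_1 v^2}\, dv}{\int_0^\infty f(v)\, e^{-\delta_2 v - \theta_1 v^2}\, dv} = \int_0^\infty e^{-(\delta_1 - \delta_2) v}\left[g_2(v) - g_1(v)\right] dv ,$$

where

$$g_i(v) = \frac{f(v)\, e^{-\delta_2 v - \theta_i v^2}}{\int_0^\infty f(z)\, e^{-\delta_2 z - \theta_i z^2}\, dz} , \qquad i = 1, 2 .$$

However,

$$\int_0^\infty \left[g_2(v) - g_1(v)\right] dv = 1 - 1 = 0 .$$

Also,

$$\frac{g_2(v)}{g_1(v)} = c'\, e^{-(\theta_2 - \theta_1) v^2} , \qquad c' > 0 .$$

Therefore, v increasing implies $g_2(v)/g_1(v)$ decreasing. Hence by a continuity
argument, we can find an M so that

$$\frac{g_2(v)}{g_1(v)} > 1 \ \text{ if } 0 \leq v < M , \qquad \frac{g_2(v)}{g_1(v)} < 1 \ \text{ if } M < v < \infty .$$

Thus applying the mean value theorem, we may write:

$$\Delta = \int_0^M e^{-(\delta_1 - \delta_2) v}\left[g_2(v) - g_1(v)\right] dv + \int_M^\infty e^{-(\delta_1 - \delta_2) v}\left[g_2(v) - g_1(v)\right] dv$$

$$= e^{-(\delta_1 - \delta_2) v_0} \int_0^M \left[g_2(v) - g_1(v)\right] dv + e^{-(\delta_1 - \delta_2) v_1} \int_M^\infty \left[g_2(v) - g_1(v)\right] dv$$

where

$$0 < v_0 < M , \qquad M < v_1 < \infty .$$

Hence, since the first integral is positive, the second is negative, and $\delta_1 > \delta_2$,

$$\Delta > e^{-(\delta_1 - \delta_2) M} \int_0^M \left[g_2(v) - g_1(v)\right] dv + e^{-(\delta_1 - \delta_2) M} \int_M^\infty \left[g_2(v) - g_1(v)\right] dv = e^{-(\delta_1 - \delta_2) M} \int_0^\infty \left[g_2(v) - g_1(v)\right] dv = 0 .$$

This proves our contention.

The proof for (ii): $t_1 > t_2 > 0$ is, except for a few minor changes, the
same.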
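The theorem can be checked numerically from the integral representation $p(t \mid \delta) = c \int_0^\infty e^{-\frac{1}{2}(t\sqrt{w/k} - \delta)^2} w^{(k-1)/2} e^{-w/2}\, dw$ obtained when $z \sim N(\delta, 1)$ and $w \sim \chi^2_k$. The sketch below approximates that integral by the trapezoid rule; since c does not depend on $\delta$, it cancels in the likelihood ratio. The function names, the truncation point `w_max`, the step count, and the parameter values are all assumptions made for illustration.

```python
import math


def noncentral_t_unnormalized(t: float, delta: float, k: int,
                              steps: int = 4000, w_max: float = 80.0) -> float:
    """Trapezoid approximation of the integral in p(t | delta), without the constant c."""
    h = w_max / steps
    total = 0.0
    for i in range(steps + 1):
        w = i * h
        value = math.exp(-0.5 * (t * math.sqrt(w / k) - delta) ** 2 - 0.5 * w)
        value *= w ** ((k - 1) / 2.0)
        weight = 0.5 if i in (0, steps) else 1.0  # end points get half weight
        total += weight * value
    return total * h


def likelihood_ratio(t: float, delta1: float, delta2: float, k: int) -> float:
    """F(t) = p(t | delta1) / p(t | delta2); the constant c cancels."""
    return (noncentral_t_unnormalized(t, delta1, k)
            / noncentral_t_unnormalized(t, delta2, k))


# Hypothetical parameter values: k = 5 degrees of freedom, delta1 > delta2.
ts = [-2.0 + 0.25 * i for i in range(17)]
ratios = [likelihood_ratio(t, 1.5, 0.0, 5) for t in ts]
```

The values in `ratios` should increase with t, in agreement with the theorem. At t = 0 the integrand factors, so the ratio equals $e^{-\frac{1}{2}(\delta_1^2 - \delta_2^2)}$, the constant that appears in the proof.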


We  shall  now  derive  a similar  property  for  the  non-central  F distribution. 

Theorem. Let G = u/v where u and v are independently distributed
according to the non-central $\chi^2$ distribution with r d.f. and the $\chi^2$ distribution
with s d.f. respectively. Then the probability distribution of G, $p(G \mid \lambda)$,
has a monotone likelihood ratio, where $\lambda$ is the non-centrality parameter.

The distribution $p(G \mid \lambda)$ is called the non-central F distribution.
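Proof. The expression for $p(G \mid \lambda)$ is given by

$$p(G \mid \lambda) = e^{-\lambda/2} \sum_{m=0}^{\infty} \frac{(\lambda/2)^m}{m!}\, k(m)\, \frac{G^{\frac{r-2}{2}+m}}{(1+G)^{\frac{r+s}{2}+m}}$$

where

$$k(m) = \left[ B\!\left(\frac{r}{2} + m,\ \frac{s}{2}\right) \right]^{-1} .$$

Letting $u = \frac{G}{1+G}$, which is clearly an increasing function of G, we have
for the distribution of u:

$$p^*(u \mid \lambda) = e^{-\lambda/2}\, (1-u)^{\frac{s-2}{2}}\, u^{\frac{r-2}{2}} \sum_{m=0}^{\infty} \frac{k(m)}{m!} \left(\frac{\lambda}{2}\right)^{m} u^{m} .$$

For $\lambda_1 > \lambda_2$, let

$$F(u) = \frac{p^*(u \mid \lambda_1)}{p^*(u \mid \lambda_2)} .$$

Hence

$$F(u) = c\, \frac{\sum_{m=0}^{\infty} \frac{k(m)}{m!} \left(\frac{\lambda_1}{2}\right)^{m} u^{m}}{\sum_{m=0}^{\infty} \frac{k(m)}{m!} \left(\frac{\lambda_2}{2}\right)^{m} u^{m}} = c\, \frac{F_1(u)}{F_2(u)} ,$$

say, where $c = e^{-\frac{1}{2}(\lambda_1 - \lambda_2)}$.

Differentiating F with respect to u yields:

$$F'(u) = \frac{c\left[F_2(u) F_1'(u) - F_1(u) F_2'(u)\right]}{\left[F_2(u)\right]^2} .$$

The functions $F_i$ may be differentiated termwise and one obtains:

$$F_i'(u) = \sum_{n=0}^{\infty} n\, \frac{k(n)}{n!} \left(\frac{\lambda_i}{2}\right)^{n} u^{n-1} .$$

Thus

$$F_2(u) F_1'(u) - F_1(u) F_2'(u) = \sum_{m=0}^{\infty} \sum_{n=0}^{\infty} n\, \frac{k(m)\, k(n)}{m!\; n!}\, \frac{u^{m+n-1}}{2^{m+n}} \left(\lambda_2^{m} \lambda_1^{n} - \lambda_1^{m} \lambda_2^{n}\right) .$$

We may write the product of the above series in terms of the double series
indicated, since we are dealing with uniformly convergent series of positive
terms. Now the above double series may be written as (omitting the arguments):

$$\sum_{m=0}^{\infty} \sum_{n=0}^{\infty} = \sum_{m=0}^{\infty} \sum_{n=0}^{m} + \sum_{m=0}^{\infty} \sum_{n=m}^{\infty}$$

since for m = n, the argument is zero.

More generally, we may write (if interchange of summation is permissible,
as it is in our case):

$$\sum_{m=0}^{\infty} \sum_{n=0}^{m} f(m, n) = \sum_{n=0}^{\infty} \sum_{m=n}^{\infty} f(m, n) = \sum_{n=0}^{\infty} \sum_{t=0}^{\infty} f(t+n, n)$$

$$\sum_{m=0}^{\infty} \sum_{n=m}^{\infty} f(m, n) = \sum_{m=0}^{\infty} \sum_{t=0}^{\infty} f(m, t+m) .$$

Using these results in the above expression, and writing n for the index m
in the second sum, we have:

$$F_2(u) F_1'(u) - F_1(u) F_2'(u) = \sum_{n=0}^{\infty} \sum_{t=0}^{\infty} \frac{k(n)\, k(t+n)}{n!\, (t+n)!}\, \frac{u^{t+2n-1}}{2^{t+2n}} \left[ n \left(\lambda_2^{t+n} \lambda_1^{n} - \lambda_1^{t+n} \lambda_2^{n}\right) + (t+n) \left(\lambda_1^{t+n} \lambda_2^{n} - \lambda_1^{n} \lambda_2^{t+n}\right) \right]$$

$$= \sum_{n=0}^{\infty} \sum_{t=0}^{\infty} t\, \frac{k(n)\, k(t+n)}{n!\, (t+n)!}\, \frac{u^{t+2n-1}}{2^{t+2n}}\, (\lambda_1 \lambda_2)^{n} \left(\lambda_1^{t} - \lambda_2^{t}\right) > 0 ,$$

which proves the assertion.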
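A numerical check of the monotonicity of the likelihood ratio in $u = G/(1+G)$ is sketched below. It assumes the standard Poisson-mixture form of the non-central F density, in which the ratio for $\lambda_1 > \lambda_2$ equals $e^{-(\lambda_1 - \lambda_2)/2}$ times a ratio of power series with coefficients $(\lambda_i/2)^m / \left(m!\, B(r/2 + m, s/2)\right)$; this normalization of $\lambda$, the truncation at a fixed number of terms, the parameter values, and the function names are all assumptions of the sketch.

```python
import math


def log_coeff(m: int, r: float, s: float) -> float:
    """log of 1 / (m! * 2^m * B(r/2 + m, s/2))."""
    a, b = r / 2.0 + m, s / 2.0
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return -log_beta - math.lgamma(m + 1) - m * math.log(2.0)


def power_series(u: float, lam: float, r: float, s: float, terms: int = 150) -> float:
    """Truncated series sum_m coeff(m) * (lam * u)^m, for u > 0 and lam > 0."""
    return sum(math.exp(log_coeff(m, r, s) + m * math.log(lam * u))
               for m in range(terms))


def likelihood_ratio(u: float, lam1: float, lam2: float, r: float, s: float) -> float:
    """Ratio of the densities of u under lam1 and lam2 (common factors cancel)."""
    c = math.exp(-(lam1 - lam2) / 2.0)
    return c * power_series(u, lam1, r, s) / power_series(u, lam2, r, s)


# Hypothetical parameter values: r = 3, s = 10, lam1 = 4 > lam2 = 1.
us = [0.05 * i for i in range(1, 20)]
ratios = [likelihood_ratio(u, 4.0, 1.0, 3.0, 10.0) for u in us]
```

The values in `ratios` should increase with u, and as u tends to 0 the ratio tends to $e^{-(\lambda_1 - \lambda_2)/2}$.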


6. Applications to Normally Distributed Variates.

We  shall  now  apply  some  of  the  principles  of  the  previous  sections  to 
problems  arising  in  inference  based  on  observations  from  normally  distributed 
random  variables.  In  particular,  we  shall  study  the  t-test  in  the  light  of 
the  discussion  carried  on  earlier. 


Let $x_1, \ldots, x_N$ be N independent observations from a random variable
with distribution $N(\mu, \sigma^2)$. Since the statistics $\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i$ and
$s^2 = \sum_{i=1}^{N} (x_i - \bar{x})^2$ are jointly sufficient for $(\mu, \sigma^2)$, we may consider Z,
the space of outcomes, to consist of the pairs $\{(\bar{x}, s),\ s > 0\}$ while the
parameter space $\Omega$ is representable as $\{(\mu, \sigma),\ \sigma > 0\}$. Defining
$t = \bar{x}/s$ and $\delta = \mu/\sigma$, we note that we may equivalently write for $z \in Z$,
$z = (ts, s)$, and for $\omega \in \Omega$, $\omega = (\delta\sigma, \sigma)$.

We note that the sufficiency principle has made it possible to reduce
our observation $(x_1, \ldots, x_N)$ from a point in N-dimensional space to one in
2-dimensional space. Using the invariance principle, we shall further reduce
this to the observation of a real-valued random variable, in order to apply
the above theory on complete classes.

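We have for the distribution of $z = (\bar{x}, s)$:

$$p_\omega(\bar{x}, s) = \frac{\sqrt{N}}{\sqrt{\pi}\; 2^{\frac{N-2}{2}}\, \sigma^{N}\, \Gamma\!\left(\frac{N-1}{2}\right)}\; s^{N-2}\; e^{-\frac{N(\bar{x} - \mu)^{2} + s^{2}}{2\sigma^{2}}} , \quad \text{for } s > 0 .$$

Now $t' = \sqrt{N(N-1)}\; t$ has the non-central t distribution, say $p(t' \mid \delta)$, with
non-centrality parameter $\sqrt{N}\, \delta$.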
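The reduction from the sample to $(\bar{x}, s)$ and then to t can be sketched in a few lines. Note that s is taken here as the square root of the sum of squared deviations with no divisor, which is the convention under which $\sqrt{N(N-1)}\, \bar{x}/s$ coincides with the usual one-sample t statistic with N - 1 degrees of freedom. The function name `student_statistics` and the sample values are hypothetical.

```python
import math
from typing import Sequence, Tuple


def student_statistics(xs: Sequence[float]) -> Tuple[float, float, float, float]:
    """Return (x_bar, s, t, t_prime) for a sample xs.

    s^2 is the sum of squared deviations from the mean (no divisor),
    t = x_bar / s, and t_prime = sqrt(N (N - 1)) * t.
    """
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    x_bar = sum(xs) / n
    s = math.sqrt(sum((x - x_bar) ** 2 for x in xs))
    if s == 0.0:
        raise ValueError("all observations are equal, so t is undefined")
    t = x_bar / s
    t_prime = math.sqrt(n * (n - 1)) * t
    return x_bar, s, t, t_prime


sample = [1.0, 2.0, 3.0, 4.0]
x_bar, s, t, t_prime = student_statistics(sample)

# Changing the scale of the data moves (x_bar, s) along a ray but leaves t fixed.
_, _, t_scaled, _ = student_statistics([10.0 * x for x in sample])
```

For this sample $\bar{x} = 2.5$ and $s^2 = 5$, and `t_prime` agrees with $\bar{x}$ divided by its estimated standard error.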
We first consider the group $\mathcal{G}$:

$$g_Z(\bar{x}, s) = (g\bar{x},\ gs)$$

$$g_\Omega(\mu, \sigma) = (g\mu,\ g\sigma) , \qquad g > 0$$

$$g_A(a) = a$$

This type of group operates, for example, when data is subjected to a
change of scale. When dealing with $\mathcal{G}$ we assume the loss L to be of the
form

$$L(\omega, a) = L((\delta, 1), a) ,$$

i.e., L depends on $\omega$ only through $\delta$. As we have stated before, this
occurs, for instance, in the classical case of testing hypotheses, where
the loss is measured in terms of the power of the test.

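We now consider various problems which remain invariant with respect
to $\mathcal{G}$.

It is easily checked that $\mathcal{G}$ is an admissible group for $G = (\Omega, \mathcal{H}, \rho)$
where

$$\rho(\omega, \eta) = \int \int L(\omega, a)\, d\eta(a \mid z)\, dP(z \mid \omega) .$$

The orbits for this group are easily determined.

Fixing $(\delta, 1) \in \Omega$ we obtain for the orbit $\Theta = \{\omega : \mu/\sigma = c\}$,
where c is a constant. Geometrically, this is a ray through the origin in
the $(\mu, \sigma)$ plane.

[Figure: an orbit shown as a ray through the origin in the $(\mu, \sigma)$ plane.]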
Thus the class of orbits $\mathcal{T}$ partitions $\Omega$, with the equivalence
relation:

$$(\mu_1, \sigma_1) \sim (\mu_2, \sigma_2) \iff \frac{\mu_1}{\sigma_1} = \frac{\mu_2}{\sigma_2} .$$

For fixed $\Theta \in \mathcal{T}$, we take as our selection function $\psi$ that function
which selects the point in $\Theta$ for which $\sigma = 1$.

Again the geometric interpretation is clear.

[Figure: the selected point on an orbit is the one with $\sigma = 1$.]

Similarly in the Z-space, fixing $z_0 = (t_0, 1)$, the orbit generated by it is
$K = \{(\bar{x}, s) : \bar{x}/s = t_0\}$, which again represents a ray through the origin in
the $(\bar{x}, s)$ plane.

It is clear that t and $\delta$ represent maximal invariants in Z and $\Omega$
respectively. We shall make use of this fact below, as we shall identify
orbits and maximal invariants.

We now use some of the results stated in Section 3 above, based on
the invariance principle. Consider the game $G^* = (\Omega^*, \mathcal{H}, \rho^*)$ which was
shown to be equivalent to G. By restricting ourselves to invariant
procedures and making use of the simplification yielded by $g_A(a) = a$, we had
obtained:

$$\rho^*(e, \Theta, \eta) = \int_{\mathcal{K}} \int_A L((\delta, 1), a)\, d\eta(a \mid K)\, dQ(K \mid \Theta) .$$

Now identifying the orbits K and $\Theta$ with the maximal invariants t and $\delta$,
we may consider the risk $\bar{\rho}$, say, where

$$\bar{\rho}(\delta, \eta) = \int_{t=-\infty}^{\infty} \int_A L((\delta, 1), a)\, d\eta(a \mid t)\, dP(t \mid \delta) .$$

From this form of the risk function, it is clear that we have reduced the
problem to the case where we are considering a statistical game in which
the statistician observes a "t", i.e., a real-valued random variable,
and nature chooses a "$\delta$", again a real-valued parameter. Since $p(t' \mid \delta)$
has a monotone likelihood ratio (where $t = t'/\sqrt{N(N-1)}$), we can conclude
that if L satisfies the conditions set forth in the theorem, an essentially
complete class of invariant procedures is the class of monotone procedures
in terms of the t statistic. We consider some examples.

(1) Suppose we want to test the hypotheses:

$$H_0: \mu \le 0 \quad \text{vs.} \quad H_1: \mu > 0.$$

Here $A = (a_0, a_1)$, i.e., accept or reject $H_0$.
Our above result tells us that the classical procedure:

Take action $a_0$ if $t \le t_0$, and $a_1$ if $t > t_0$,

which is clearly a monotone procedure, does form an essentially complete
class.
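
The classical rule of Example (1) can be sketched in a few lines. This is only an illustration under stated assumptions: the critical value `t0` is supplied by the caller (no table lookup is attempted), the actions are coded as 0 (accept) and 1 (reject), and the names `student_t` and `one_sided_rule` are hypothetical.

```python
import math
import statistics

def student_t(xs):
    """Student's t statistic for the mean of xs (needs len(xs) >= 2 and
    a non-degenerate sample, otherwise the standard deviation is zero)."""
    n = len(xs)
    return math.sqrt(n) * statistics.mean(xs) / statistics.stdev(xs)

def one_sided_rule(xs, t0):
    """Monotone two-action rule: 0 = accept H, 1 = reject H.

    The action can only switch from 0 to 1 as t increases, which is
    what makes the rule monotone in t.
    """
    return 1 if student_t(xs) > t0 else 0

# Hypothetical data and critical value, for illustration only.
action_high = one_sided_rule([1.0, 2.0, 3.0], t0=0.5)
action_low = one_sided_rule([-1.0, -2.0, -3.0], t0=0.5)
```

Because the decision depends on the sample only through `student_t`, rescaling every observation by the same positive constant leaves the chosen action unchanged.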

(2) We can consider an estimation problem, where we have $N$ independent
observations, $x_1, \dots, x_N$, from a population with distribution $N(\mu, \sigma^2)$.
It is required to obtain an estimate of the quantity

$$p = \frac{1}{\sqrt{2\pi}\,\sigma} \int_0^{\infty} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}\,dx.$$

Letting $y = (x - \mu)/\sigma$, we obtain

$$p = \frac{1}{\sqrt{2\pi}} \int_{-\delta}^{\infty} e^{-\frac{1}{2}y^2}\,dy.$$

Hence $p$ depends on $\omega$ only through $\delta$. Here the action space $A$ is the
closed interval $[0,1]$. If we take the loss function of the form
$L(\omega, a) = f(p)\,[\hat{p} - p]^2$, where $\hat{p}$ is the estimate and $f(p) > 0$ for all $p$, then
our group is again admissible and we can conclude that an essentially
complete class of invariant procedures consists of estimates $\hat{p}$ which are
monotone functions of $t$.
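
To make the dependence on $\delta$ concrete, the following sketch assumes that $p$ is the proportion of a normal population lying above zero, so that $p = \Phi(\mu/\sigma)$. The plug-in estimate shown is merely one example of an estimate that is a monotone function of $t$; it is not claimed to be the procedure intended by the paper, and all function names are hypothetical.

```python
import math
import statistics

def normal_cdf(u):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

def proportion_above_zero(mu, sigma):
    """p = P(X > 0) for X ~ N(mu, sigma^2); depends only on mu / sigma."""
    return normal_cdf(mu / sigma)

def plug_in_estimate(xs):
    """Illustrative estimate of p: an increasing function of xbar / s."""
    return normal_cdf(statistics.mean(xs) / statistics.stdev(xs))

# p is unchanged when mu and sigma are multiplied by the same g > 0.
p_a = proportion_above_zero(1.0, 1.0)
p_b = proportion_above_zero(2.0, 2.0)

# Hypothetical sample: mean 2, sample standard deviation 1.
p_hat = plug_in_estimate([1.0, 2.0, 3.0])
```

The estimate always lies in the action space $[0,1]$ and inherits the scale invariance of $\bar{x}/s$.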

(3) We consider a slight generalisation of Example (1).
Let $I_j = (\delta_j, \delta_{j+1})$, $j = 1, \dots, k$, where $\delta_1, \dots, \delta_{k+1}$ constitute
a partition of the $\delta$-axis into subintervals. We want to test $\delta \in I_j$,
$j = 1, \dots, k$. If $L((\delta, 1), a)$ is of the form described earlier, then we again
have the result that an essentially complete class of procedures is given by
the procedures which are monotone.

Clearly (2) and (3) are examples of the problem considered under the
Student Hypothesis, i.e., decisions concerning the proportion of the
population falling beyond or below a certain value, or what is equivalent,
decisions concerning $\delta$; example (1) is the classical problem, and is a
special case of the others.

We shall next consider another group, and an application connected
with it.

Consider the group where

$$g_Z(\bar{x}, s) = (g\bar{x},\ |g|s)$$
$$g_\Omega(\mu, \sigma) = (g\mu,\ |g|\sigma)$$
$$g_A(a) = a$$

with $-\infty < g < \infty$, $g \ne 0$.

Assume that when operating with this group, the loss $L$ is of the form
$L(\omega, a) = L(|\delta|, a)$.

For this group, maximal invariants on $Z$ and $\Omega$, respectively, are

$$|t| \propto \frac{|\bar{x}|}{\sqrt{\sum_i (x_i - \bar{x})^2}} \quad \text{and} \quad |\delta| = \frac{|\mu|}{\sigma}.$$

The geometric representations of the orbits in $Z$ and $\Omega$ are the 'reflected'
rays $|t| = c$ and $|\delta| = c$.

We are thus led to the statistic $\bar{x}^2 / \sum_i (x_i - \bar{x})^2$, which has
(properly normalised) the non-central $F$ distribution. Hence we may apply
the monotonicity property of this distribution
to construct essentially complete classes of invariant procedures with
respect to this group.
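
For reference, the standard relation behind the normalisation can be written out as follows. This is a sketch under the assumption that $T$ denotes Student's statistic computed from $N$ observations with the usual $(N-1)$-divisor variance estimate; the symbols $T$ and $s_x$ are introduced here only for illustration, and conventions for the non-centrality parameter differ between texts.

```latex
% Assumption: T = \sqrt{N}\,\bar{x}/s_x, with s_x^2 = \sum_i (x_i-\bar{x})^2/(N-1).
% T is non-central t with N-1 degrees of freedom, so T^2 is non-central F.
T \sim t_{N-1}\bigl(\sqrt{N}\,\delta\bigr)
\quad\Longrightarrow\quad
T^2 = \frac{N(N-1)\,\bar{x}^2}{\sum_i (x_i-\bar{x})^2}
\sim F_{1,\,N-1}\bigl(N\delta^2\bigr),
\qquad \delta = \mu/\sigma .
```

The distribution of $T^2$ depends on the parameters only through $\delta^2$, in agreement with $|\delta|$ being the maximal invariant on $\Omega$.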

So far we have considered examples for which we assumed $g_A(a) = a$;
this we recall leads to considerable simplification in the expression
for $\rho^*$, and makes it possible, in certain cases, to apply the complete
class theorem directly. Let us now discuss an example in which the situation
is more complex. Consider the group $\mathcal{G}$ defined by

$$g_Z(\bar{x}, s) = (g\bar{x},\ |g|s)$$
$$g_\Omega(\mu, \sigma) = (g\mu,\ |g|\sigma)$$
$$g_A(a) = a \ \text{ if } g > 0, \qquad g_A(a) = 1 - a \ \text{ if } g < 0.$$

This type of transformation is of interest when we consider the estimation
problem.

We define two subgroups of $\mathcal{G}$:

$$\mathcal{G}_1 = \{g : g > 0\}, \qquad \mathcal{G}_2 = \{g : g = \pm 1\};$$

we note that $\mathcal{G}_1$ is the same as considered before. We proceed as
follows to construct an essentially complete class of decision procedures.

Let $\varphi$ be any invariant decision procedure (with respect to $\mathcal{G}$);
then $\varphi$ is invariant with respect to $\mathcal{G}_1$, since $\mathcal{G}_1 \subset \mathcal{G}$. Since
$\mathcal{G}_1$ is the group considered earlier,
we have from the previous example, that $\varphi$ is a function of $t$.

It is proved in [7] that given any procedure $\varphi$, defined for a real-valued
random variable, and invariant with respect to $\mathcal{G}_2$ (which we assume to be
operating on $t$ and $\delta$, changing $t \to -t$, $\delta \to -\delta$ and $a \to 1 - a$),
there exists a procedure $\varphi'$, invariant and monotone, so that
$\rho(\delta, \varphi') \le \rho(\delta, \varphi)$, all $\delta$.

Hence, any $\varphi$ invariant with respect to $\mathcal{G}$ is invariant with respect
to $\mathcal{G}_2$ and, being a function of $t$, we may apply the above result and
obtain the result that for $\mathcal{G}$ the class of invariant, monotone (in $t$)
procedures forms an essentially complete class.
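
For a non-randomized estimate, invariance under the sign change $t \to -t$, $a \to 1 - a$ amounts to the symmetry $\varphi(-t) = 1 - \varphi(t)$. The sketch below checks this property, together with monotonicity, for one hypothetical estimate of the form $\Phi(ct)$; the constant `c` and all function names are assumptions made for illustration and do not come from the paper or from [7].

```python
import math

def normal_cdf(u):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

def estimate(t, c=1.0):
    """Hypothetical non-randomized estimate of p, increasing in t for c > 0."""
    return normal_cdf(c * t)

def is_sign_invariant(est, ts, tol=1e-12):
    """Check est(-t) == 1 - est(t), i.e. invariance under g = -1."""
    return all(abs(est(-t) - (1.0 - est(t))) <= tol for t in ts)

def is_monotone(est, ts):
    """Check that est is non-decreasing over the sorted grid ts."""
    vals = [est(t) for t in sorted(ts)]
    return all(a <= b for a, b in zip(vals, vals[1:]))

grid = [-3.0, -1.0, -0.25, 0.0, 0.25, 1.0, 3.0]
invariant_ok = is_sign_invariant(estimate, grid)
monotone_ok = is_monotone(estimate, grid)

# A monotone estimate that ignores the symmetry fails the invariance check.
shifted_ok = is_sign_invariant(lambda t: normal_cdf(t + 1.0), grid)
```

The last line shows that monotonicity alone does not imply invariance under the sign subgroup; both conditions have to be imposed.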

We shall next consider the following application to the multivariate
case.

Let $x$ be a $p$-dimensional random variable distributed according to
the multivariate normal law $N(\mu, \Sigma)$, where $\mu$ is the vector of expectations
and $\Sigma$ the covariance matrix (assumed to be non-singular). Let $x_1, \dots, x_n$
be a sample of size $n$ of $x$ and let $\bar{x}$ be the vector of sample means and
$S = X'X - n\bar{x}\bar{x}'$, where $X$ is the matrix of observations. Since $\bar{x}$ and $S$ are
jointly sufficient for $\mu$ and $\Sigma$ we need only consider the sample space
$(Z, \mathcal{B}, \Omega, P)$, where $z = (\bar{x}, S)$. We wish to test the hypotheses:

$$H_0: \mu = 0 \quad \text{vs.} \quad H_1: \mu \ne 0.$$

Assume the loss $L$ to be of the form $L(\mu, \Sigma, a) = L^*(\mu'\Sigma^{-1}\mu, a)$. Consider
the group $\mathcal{G}$ of all $p \times p$ non-singular matrices, and define the following
operation:

$$g_Z(\bar{x}, S) = (g\bar{x},\ gSg')$$
$$g_\Omega(\mu, \Sigma) = (g\mu,\ g\Sigma g')$$
$$g_A(a) = a$$

We consider the statistic $F = \bar{x}'S^{-1}\bar{x}$ and shall show that it is a
maximal invariant with respect to the group defined above.

$F$ is clearly invariant; to show that it is maximal, we must show that
any invariant function depends only on $F$. Assume that $\bar{x} \ne 0$. Let $z_0 = (\bar{x}_0, S_0)$
be a fixed element of $Z$, and let $K = \{gz_0 : g \in \mathcal{G}\}$ be its
orbit. Then $K$ contains an element of the form $\big((w, 0, \dots, 0)',\ I\big)$, where $w > 0$
and $I$ is the identity. This is so, since:

We can find a $g \in \mathcal{G}$, so that $gS_0g' = I$. Also there is an orthogonal
$h \in \mathcal{G}$ with $hg\bar{x}_0 = (w, 0, \dots, 0)'$; hence $hgS_0(hg)' = hIh' = I$ and thus the element
$k = hg$ maps $(\bar{x}_0, S_0)$ into $\big((w, 0, \dots, 0)',\ I\big)$.

Thus since invariant functions are constant over each orbit, we see
that any invariant function depends only on $w$.

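Evaluating $w^2$, with the above $h$ and $g$, we have:

$$w^2 = \bar{x}_0'g'h'\,[h(gS_0g')h']^{-1}\,hg\bar{x}_0$$
$$= \bar{x}_0'g'h'\,h\,g'^{-1}S_0^{-1}g^{-1}\,h'\,hg\bar{x}_0$$
$$= \bar{x}_0'S_0^{-1}\bar{x}_0$$

Hence any invariant function depends only on $w^2$; i.e., $F = \bar{x}'S^{-1}\bar{x}$ is a
maximal invariant.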
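
The invariance half of this argument can be checked numerically. The sketch below does so for $p = 2$ using plain Python lists, so that it runs without any external library; the particular vector, matrices and helper names are hypothetical and chosen only for illustration.

```python
import math

def mat_vec(g, x):
    """Product of a 2x2 matrix g with a 2-vector x."""
    return [g[0][0] * x[0] + g[0][1] * x[1],
            g[1][0] * x[0] + g[1][1] * x[1]]

def mat_mul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(a):
    """Transpose of a 2x2 matrix."""
    return [[a[0][0], a[1][0]], [a[0][1], a[1][1]]]

def quad_form_inv(x, s):
    """Compute x' S^{-1} x for a non-singular 2x2 matrix S."""
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det],
           [-s[1][0] / det, s[0][0] / det]]
    y = mat_vec(inv, x)
    return x[0] * y[0] + x[1] * y[1]

def act(g, z):
    """Group action on z = (xbar, S): (g xbar, g S g')."""
    xbar, s = z
    return (mat_vec(g, xbar), mat_mul(mat_mul(g, s), transpose(g)))

z0 = ([1.0, -2.0], [[4.0, 1.0], [1.0, 3.0]])  # S is positive definite
g = [[2.0, 1.0], [-1.0, 3.0]]                  # determinant 7, non-singular

f_before = quad_form_inv(*z0)
f_after = quad_form_inv(*act(g, z0))
```

The two values agree up to rounding, as the identity $(g\bar{x})'(gSg')^{-1}(g\bar{x}) = \bar{x}'S^{-1}\bar{x}$ requires. Maximality is not tested here; it is the content of the orbit argument in the text.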

Since $F$, properly normalised, has a non-central $F$ distribution, we
can again obtain the characterisation of an essentially complete class,
as a class of monotone procedures in $F$.